Abstract

Cumulative sum (CUSUM) charts are typically used to detect changes in a stream of
observations, e.g. shifts in the mean. Usually, after signalling, the chart is restarted by setting it to
some value below the signalling threshold. We propose a non-restarting CUSUM chart which is
able to detect periods during which the stream is out of control. Further, we advocate an upper
boundary to prevent the CUSUM chart rising too high, which helps to detect a change back
into control. We present a novel algorithm to control the false discovery rate (FDR) pointwise
in time when considering CUSUM charts based on multiple streams of data. We prove that the
FDR is controlled under two definitions of a false discovery simultaneously. Simulations reveal
the difference in FDR control when using these two definitions and other desirable definitions
of a false discovery.
1 Introduction



One of the most widely used control charts is the cumulative sum (CUSUM) chart suggested by
Page (1954), which in its simplest form is defined as follows. Consider observing a stream $X_t$,
$t \in \mathbb{N} = \{1, 2, \dots\}$, of independent random variables. Suppose that when in control $X_t \sim N(0, 1)$.
Assume that after an unknown time $\tau \in [0, \infty]$, the observations switch to an out-of-control state
where $X_t \sim N(\Delta, 1)$ for some known $\Delta > 0$. Then the classic CUSUM chart is

$$S_t = \max(S_{t-1} + X_t - \Delta/2, 0), \quad S_0 = 0. \tag{1}$$

The chart signals a change at the hitting time $\inf\{t > 0 : S_t \geq \zeta\}$ for some threshold $\zeta > 0$. Hawkins
and Olwell (1998) give a detailed background of CUSUM charts and their applications.
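The recursion of the classic chart and its hitting time can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `classic_cusum`, the argument names `delta` (the known shift) and `zeta` (the signalling threshold), and the example data are assumptions made for this example.

```python
def classic_cusum(xs, delta, zeta):
    """Run the classic CUSUM chart on observations xs.

    Returns the chart path S_1, ..., S_n and the first time (1-based)
    at which the chart reaches the threshold zeta, or None if it never does.
    """
    s = 0.0          # S_0 = 0
    path = []
    hit = None
    for t, x in enumerate(xs, start=1):
        # S_t = max(S_{t-1} + X_t - delta/2, 0)
        s = max(s + x - delta / 2.0, 0.0)
        path.append(s)
        if hit is None and s >= zeta:
            hit = t
    return path, hit


# Hypothetical data: small values first, then a run of larger values.
path, hit = classic_cusum([0.1, -0.3, 1.2, 0.9, 1.5, 1.1], delta=1.0, zeta=2.0)
```

The chart stays at 0 while the observations are below `delta / 2` and accumulates evidence once they exceed it.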



CUSUM charts were originally designed for industrial settings. Quoting Page (1954): [Process
inspection schemes are] "required to detect a deterioration in the quality of the output from a
continuous process. When such a deterioration is suspected some action is taken; for example, the
production may be suspended and a machine reset." This explains why, once a CUSUM chart
crosses the threshold $\zeta$, it is typically restarted at 0. Restarting at a different value such as $\zeta/2$
has also been suggested (Lucas and Crosier, 1982).



In this paper we are concerned with monitoring multiple data streams in situations where 
restarting is not possible, e.g. a medical setting where each stream relates to the performance of a 
hospital. Even if we suspect a deterioration of performance, it is unlikely that the hospital would 
close or suspend treatment of patients. Moreover, we are interested in scenarios where streams can 
switch, potentially multiple times, between an in-control state and an out-of-control state. The 
setting of monitoring multiple streams of observations has recently become a topic of increasing
interest (Mei, 2010; Li and Tsung, 2009), in particular in medical settings (Spiegelhalter et al.,
2012; Bottle and Aylin, 2008; Biswas and Kalbfleisch, 2008).



We propose a novel algorithm to control the false discovery rate (FDR) of multiple data streams 
pointwise in time. To monitor these data streams we suggest using non-restarting CUSUM charts
with an upper boundary. A non-restarting CUSUM chart continues when its threshold is crossed. 
This leads to periods during which the stream is considered to be out of control. Moreover, we 
impose an upper boundary on the chart which improves detection when the chart comes back in 
control. 

In this algorithm, a false discovery would naturally be defined as signalling the stream to be 
out-of-control when in fact the observations have been in-control since the start. We prove in 
Theorem 1 that the algorithm simultaneously controls the FDR for the following less restrictive
definition of false discovery: signalling the stream to be out-of-control when in fact the observations 
have been in-control since the last time the chart was at 0. 

Previous work concerning FDR control procedures in statistical process control settings goes
back to Benjamini and Kling (1999) and Benjamini and Kling (2007). Grigg and Spiegelhalter
(2008) considered monitoring normally distributed streams of observations through CUSUM charts
that are restarted after a signal. Li and Tsung (2009) propose a method to control the FDR over
the stages of a multistage process. They apply an FDR control procedure on a single unit over the
stages of production with the aim of finding a faulty stage. This differs from our aim, which is to
control the FDR pointwise in time across multiple units. In Mei (2010) a method is proposed using
a global false alarm constraint across multiple streams of data. However, the setting considered
only allows for one global time at which some of the data streams change from the in-control state
to the out-of-control state.

Our contributions to this area are to focus on a situation where restarting is not possible, to 
modify the CUSUM chart to enable it to signal periods of in-control and out-of-control observations, 
and to discuss the meaning of a false discovery in this setting. 



2 Non-Restarting CUSUM Charts with an Upper Boundary 

We now present the general setting and CUSUM charts we shall be using. Consider a stream of
independent real-valued random variables $Z_1, Z_2, \dots$ with distribution functions $F_1, F_2, \dots$
respectively. At time $t$, the random variable $Z_t$ is in control if $F_t = F_t^*$ and out of control if $F_t \neq F_t^*$,
for some known in-control distributions $F_1^*, F_2^*, \dots$. We consider extensions of the CUSUM charts
(Page, 1954) of the form

$$S_t = \varphi[\min\{\max(S_{t-1} + Z_t, 0), h\}], \quad S_0 = 0, \tag{2}$$

where $\varphi$ is a non-decreasing function and $h > 0$ is a constant specifying an upper boundary.

The classic CUSUM chart (1) is a special case of (2), obtained by using $Z_t = X_t - \Delta/2$, with in-control
distribution $N(-\Delta/2, 1)$, $h = \infty$ and $\varphi(x) = x$. Another example is the loglikelihood CUSUM chart
(Moustakides, 1986)

$$S_t = \max[S_{t-1} + \log\{f_1(X_t)/f_0(X_t)\}, 0], \quad S_0 = 0,$$

where $f_0$ and $f_1$ are the probability density functions of the in-control and out-of-control distribution
respectively. Again this is a special case of (2), obtained by letting $Z_t = \log\{f_1(X_t)/f_0(X_t)\}$, $h = \infty$ and $\varphi(x) = x$.

We include $\varphi$ in (2) to allow CUSUM charts in which, at every step, $S_t$ is rounded to finitely
many values. For these charts we can compute the exact distribution of $S_t$ at a fixed $t$ using Markov
chains (Brook and Evans, 1972). This is discussed further in Section 4.
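As an illustration of the general recursion with an upper boundary and a rounding function, the following Python sketch may help. The names `chart_step`, `run_chart` and `round_half`, and the grid width of 0.5, are hypothetical choices made for this example only.

```python
import math


def chart_step(s_prev, z, h, phi=lambda x: x):
    """One update: reflect at 0, cap at the upper boundary h, then apply phi."""
    return phi(min(max(s_prev + z, 0.0), h))


def run_chart(zs, h, phi=lambda x: x):
    """Return the chart path S_1, ..., S_n for increments zs, starting at S_0 = 0."""
    s = 0.0
    path = []
    for z in zs:
        s = chart_step(s, z, h, phi)
        path.append(s)
    return path


def round_half(x):
    """Illustrative non-decreasing phi: round to the nearest multiple of 0.5."""
    return round(2.0 * x) / 2.0


bounded = run_chart([3, 3, 3, -1], h=5.0)          # capped at 5
unbounded = run_chart([3, 3, 3, -1], h=math.inf)   # classic behaviour
rounded = run_chart([0.3, 0.3], h=5.0, phi=round_half)
```

With `h = math.inf` and the identity function for `phi`, the update coincides with the classic reflected recursion; a finite `h` stops the chart drifting far above the signalling region.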
Figure 1: Graph of a CUSUM chart against time with no upper boundary (dot-dash), with upper boundary
$h = 10$ (dashed) and with a restarting threshold $\zeta = h/2 = 5$ (solid). The grey box represents the
times at which the observations are truly out-of-control.



We propose not restarting the chart once its threshold is crossed. Instead, as long as the chart 
is above the threshold, we say it signals continuously until it drops back below the threshold. This 
will allow us to detect periods where the observations are in or out of control. To avoid the chart 
climbing very high above the threshold, which may make detecting that the stream is back in 
control difficult, we impose the upper boundary h > 0. This is important in our setting where the 
observations can switch in and out of control multiple times. 

To compare the non-restarting CUSUM chart to other charts, consider the CUSUM chart (2)
with in-control distribution $N(-1/2, 1)$ and out-of-control distribution $N(1/2, 1)$ with $h = 10$ and
$\varphi(x) = x$. We compare this to the same CUSUM chart with no upper boundary ($h = \infty$) and a
restarting CUSUM chart which resets to zero when the threshold $\zeta = h/2 = 5$ is crossed. Figure
1 shows CUSUM charts over 100 time points, where the observations are out-of-control from time
20 to 60, represented by the grey box.

All charts are identical until they reach the threshold $\zeta$ for the first time. The non-restarting
chart signals from time 33 to 66. So the out-of-control signal stops a few steps after the stream 
has returned to the in-control state. The restarting chart then signals at times 33, 37, 49, 56. The 
main downside of this is that it does not suggest a period where the stream is out-of-control and, 
importantly, there is no signal that the out-of-control period has ended. The boundary-free chart 
signals from 33 to 86. Clearly this lasts considerably longer than the out-of-control period. This is 
mainly due to the high values attained during the out-of-control period. 
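The qualitative difference between the three charts can be reproduced on a small deterministic example. The sketch below is illustrative only: the helper names (`bounded_path`, `signal_periods`, `restarting_signals`) and the toy increments are assumptions, and the numbers are not those of Figure 1.

```python
import math


def bounded_path(zs, h):
    """Non-restarting chart with upper boundary h (identity rounding)."""
    s, path = 0.0, []
    for z in zs:
        s = min(max(s + z, 0.0), h)
        path.append(s)
    return path


def signal_periods(path, zeta):
    """Maximal runs of times (1-based, inclusive) at which the chart is >= zeta."""
    periods, start = [], None
    for t, s in enumerate(path, start=1):
        if s >= zeta and start is None:
            start = t
        elif s < zeta and start is not None:
            periods.append((start, t - 1))
            start = None
    if start is not None:
        periods.append((start, len(path)))
    return periods


def restarting_signals(zs, zeta):
    """Times at which a chart that resets to 0 after each signal reaches zeta."""
    s, times = 0.0, []
    for t, z in enumerate(zs, start=1):
        s = max(s + z, 0.0)
        if s >= zeta:
            times.append(t)
            s = 0.0
    return times


# Toy increments: five upward steps followed by three downward steps.
zs = [2, 2, 2, 2, 2, -3, -3, -3]
with_boundary = signal_periods(bounded_path(zs, h=6.0), zeta=3.0)
no_boundary = signal_periods(bounded_path(zs, h=math.inf), zeta=3.0)
restarts = restarting_signals(zs, zeta=3.0)
```

In this toy run the bounded chart stops signalling one step earlier than the boundary-free chart, while the restarting chart only yields isolated signal times.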



3 False Discovery Rate 



3.1 Control of False Discovery Rate in Multiple Testing 

We now consider monitoring multiple data streams using a non-restarting CUSUM chart with
upper boundary (Section 2) for each stream. Instead of using a fixed threshold $\zeta$ to determine
which streams are out of control, we suggest using an FDR control procedure. We first briefly
review the procedure developed by Benjamini and Hochberg (1995).

Consider testing $N$ null hypotheses $H_1, H_2, \dots, H_N$ simultaneously. Denote the number of true
null hypotheses by $m_0$. Let $V$ be the number of true null hypotheses declared significant and $R$ be
the total number of null hypotheses declared significant. Define $Q = V/R$ as the proportion of the rejected
null hypotheses which are incorrectly rejected, with the convention $0/0 = 0$. The FDR is then
defined as $E(Q)$.

Suppose we have independent tests with corresponding p-values $P_1, P_2, \dots, P_N$ for the
hypotheses. The following algorithm proposed by Benjamini and Hochberg (1995) ensures the FDR
is at most a pre-specified constant $q^* \in (0, 1)$.



Algorithm 1 (Control of the FDR at $q^* \in (0, 1)$)

1. Order the p-values as $P_{(1)} \leq P_{(2)} \leq \dots \leq P_{(N)}$, where $P_{(i)}$ corresponds to $H_{(i)}$.

2. Let $k$ be the largest $i$ for which $P_{(i)} \leq \frac{i}{N} q^*$.

3. Reject $H_{(i)}$ for $i = 1, 2, \dots, k$.
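The three steps of the step-up procedure can be sketched as follows. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions; the function returns the indices (0-based, in the original order) of the rejected hypotheses.

```python
def benjamini_hochberg(pvalues, q):
    """Step-up procedure: return the sorted indices of rejected hypotheses."""
    n = len(pvalues)
    # Step 1: order the p-values, remembering which hypothesis each belongs to.
    order = sorted(range(n), key=lambda i: pvalues[i])
    # Step 2: find the largest rank i with P_(i) <= (i / n) * q.
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / n:
            k = rank
    # Step 3: reject the hypotheses with the k smallest p-values.
    return sorted(order[:k])


rejected = benjamini_hochberg([0.001, 0.3, 0.012, 0.04, 0.9], q=0.05)
```

Because the procedure looks for the largest qualifying rank, a p-value can be rejected even if it exceeds its own rank threshold, provided a larger p-value meets its threshold.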



This procedure controls the FDR at $q^*$, i.e. $E(Q) \leq (m_0/N) q^* \leq q^*$. The procedure requires
(Benjamini and Yekutieli, 2001, Th. 5.1) that the p-values satisfy

$$\mathrm{pr}\left(P_i \leq \frac{k}{N} q^* \,\Big|\, H_i\right) \leq \frac{k}{N} q^* \quad (k = 0, \dots, N;\ i = 1, 2, \dots, N), \tag{3}$$

which is satisfied when $P_i$ is computed conditionally on $H_i$ being true (Lehmann and Romano,
2005, pg. 64, Lemma 3.3.1). The allocation of which null hypotheses are true can be random, and
the FDR conditional on this allocation will still be controlled.

Based upon the above method, other FDR control procedures have been developed, e.g. the
two-step FDR control procedure (Benjamini et al., 2006, Def. 6), the adaptive linear step-up
procedure (Benjamini et al., 2006, Def. 3) and the adaptive step-down procedure (Gavrilov et al.,
2009). These other procedures involve estimating $m_0$, by $\hat{m}_0$ say, before applying the Benjamini
and Hochberg (1995) procedure at level $q^* N / \hat{m}_0$.



3.2 Algorithm 

We wish to control the FDR at each time point using CUSUM charts for multiple streams. We 
first state the algorithm before precisely defining a false discovery in our setting. 

Suppose we observe $N$ independent streams of observations $(Z_{i,t})_{t \in \mathbb{N}}$ $(i = 1, \dots, N)$. Each $Z_{i,t}$
has distribution function $F_{i,t}$ with $F_{i,t} = F^*_{i,t}$ when $Z_{i,t}$ is in-control and $F_{i,t} \neq F^*_{i,t}$ when $Z_{i,t}$ is
out-of-control. All $F^*_{i,t}$ are assumed to be known. For each stream $(Z_{i,t})_{t \in \mathbb{N}}$ we run a non-restarting
CUSUM chart $S_{i,t}$ with upper boundary $h$ according to (2).

We propose the following algorithm to control the FDR at level $q^* \in (0, 1)$ at each time $t$. Any
FDR control procedure that controls the FDR at $q^*$ whenever (3) holds can be used. These include
the aforementioned two-step, adaptive linear step-up and adaptive step-down procedures.

The following algorithm is written for the homogeneous case where $F^*_{i,t} = F^*_t$ for all $i$.

Algorithm 2 (Control of the FDR at $q^* \in (0, 1)$ at a fixed time $t$)

1. Let $(S^*_t)_{t \in \mathbb{N}}$ be a chart with all observations in control, i.e. $F_\nu = F^*_\nu$ for all $\nu$. Compute the
distribution of $S^*_t$ and let $P(s) = \mathrm{pr}(S^*_t \geq s)$.

2. For the observed streams $(i = 1, \dots, N)$ compute the p-values $P_{i,t} = P(S_{i,t})$.

3. Apply the chosen FDR procedure with level $q^*$ to the p-values $P_{1,t}, \dots, P_{N,t}$. The rejected
streams are signalled to be out-of-control.

It is straightforward to adapt this to the general case, where each stream can have a different
in-control distribution or a different upper boundary, by computing the p-values separately for each
stream.
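For a chart that takes finitely many values, the steps of the algorithm at one fixed time can be sketched as below. This is a sketch under stated assumptions: the in-control distribution is supplied as hypothetical inputs `states` and `pmf`, the Benjamini and Hochberg step-up rule is used as the FDR procedure, and the function names and toy numbers are illustrative.

```python
def survival_function(states, pmf):
    """Return P with P(s) = pr(S* >= s) for an in-control chart on `states`."""
    def P(s):
        return sum(p for v, p in zip(states, pmf) if v >= s)
    return P


def fdr_signals(chart_values, states, pmf, q):
    """Return (p-values, indices of streams signalled out of control)."""
    P = survival_function(states, pmf)
    # p-value of each stream: in-control probability of a chart at least this high
    pvals = [P(s) for s in chart_values]
    # Benjamini-Hochberg step-up rule applied to the p-values
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / n:
            k = rank
    return pvals, sorted(order[:k])


# Toy in-control distribution on four states and four observed chart values.
states = [0.0, 1.0, 2.0, 3.0]
pmf = [0.7, 0.2, 0.09, 0.01]
pvals, signalled = fdr_signals([0.0, 3.0, 1.0, 3.0], states, pmf, q=0.05)
```

The implied signalling threshold is not fixed in advance: it depends on the chart values of all streams at that time through the step-up rule.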

If we use $\varphi$ in (2) to force the chart to take only finitely many values then Step 1 can be
accomplished using Markov chains. Otherwise, $P(s)$ can be approximated through various methods
such as a finite-state Markov chain approximation (Brook and Evans, 1972) or use of the steady
state distribution of the CUSUM chart (Grigg and Spiegelhalter, 2008).
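To make the Markov chain computation concrete, the sketch below computes the distribution of an in-control chart at a fixed time. It rests on assumptions that are not taken from the text: increments are $N(\mu, 1)$ with $\mu = -1/2$, the chart is rounded to the nearest point of an evenly spaced grid of $M + 1$ states on $[0, h]$, and the function names are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def in_control_distribution(t, h=10.0, M=99, mu=-0.5):
    """Distribution at time t of a rounded chart with N(mu, 1) increments.

    Assumed grid: states j * h / M (j = 0, ..., M); a value is rounded to the
    nearest state, i.e. the cut points are the midpoints between states.
    """
    delta = h / M
    states = [j * delta for j in range(M + 1)]
    cuts = [(j - 0.5) * delta for j in range(1, M + 1)]
    # Transition matrix: row a holds the law of the next state given state a.
    rows = []
    for s in states:
        cdf = [norm_cdf(w - s - mu) for w in cuts]
        row = [cdf[0]]                                   # reflected to state 0
        row += [cdf[j] - cdf[j - 1] for j in range(1, M)]
        row.append(1.0 - cdf[-1])                        # capped at h
        rows.append(row)
    # Start at S_0 = 0 and propagate the distribution t steps forward.
    dist = [1.0] + [0.0] * M
    for _ in range(t):
        dist = [sum(dist[a] * rows[a][b] for a in range(M + 1))
                for b in range(M + 1)]
    return states, dist


states, dist = in_control_distribution(5, h=10.0, M=20)
# Tail probability of reaching at least the state nearest to 2.0 at time 5.
p_value_at_2 = sum(p for v, p in zip(states, dist) if v >= 2.0)
```

The tail sums of `dist` give the function $P(s)$ needed for the p-values; for a chart that is not rounded, the same construction only provides an approximation.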



3.3 Null Hypothesis: In-Control Since Start 



In this section we show that Algorithm 2 in Section 3.2 controls the FDR at a fixed time $t$ if a false
discovery is defined as: a stream that signals out-of-control at time $t$, when it has in fact been in
control since time 0.

To phrase this in the language of hypothesis testing, the null hypotheses are

$$H^{s}_{i,t} = \{F_{i,\nu} = F^*_{i,\nu} \text{ for all } 0 < \nu \leq t\} \quad (i = 1, \dots, N). \tag{4}$$

A null hypothesis $H^{s}_{i,t}$ is declared significant when it is rejected by the FDR control procedure.
Thus, at each time $t \in \mathbb{N}_0 = \{0, 1, 2, \dots\}$,

$$V = \#\{i : F_{i,\nu} = F^*_{i,\nu} \text{ for all } 0 < \nu \leq t,\ H^{s}_{i,t} \text{ is significant}\} \quad \text{and} \quad R = \#\{\text{significant hypotheses}\}.$$

The p-values are computed in agreement with the null hypotheses (4). Thus condition (3) holds
and our algorithm (Algorithm 2 in Section 3.2) controls the FDR at $q^*$, i.e. $E(Q) = E(V/R) \leq q^*$.



3.4 Null Hypothesis: In-Control Since Visiting 

The definition of a false discovery in the previous section implies that all discoveries made after a 
stream goes out of control for the first time are considered true discoveries. Thus a signal for a 
stream that has been out-of-control and then comes back in-control will never be considered a false 
discovery, no matter how long it has already been back in control. 

In this section we show that Algorithm 2, without changing the way the p-values are
computed, also controls the FDR when a false discovery is defined as: a stream being signalled
out-of-control at time $t$, when it has been in control since its chart was last at 0. The corresponding null
hypotheses are

$$H^{0}_{i,t} = \{\text{there exists } \tau \in \{0, \dots, t\} : S_{i,\tau} = 0,\ F_{i,\nu} = F^*_{i,\nu} \text{ for all } \tau < \nu \leq t\} \quad (i = 1, \dots, N).$$

Thus,

$$V = \#\{i : H^{0}_{i,t} \text{ is significant and there exists } \tau : S_{i,\tau} = 0,\ F_{i,\nu} = F^*_{i,\nu} \text{ for all } \tau < \nu \leq t\}.$$

The definitions of declared significant and $R$ remain the same as before. The p-values are computed
as before. The following theorem shows that (3) is satisfied and thus the Benjamini and Hochberg
(1995) FDR procedure still controls the FDR.

Theorem 1 For all $x \in [0, 1]$ and for $t \in \mathbb{N}_0$,

$$\mathrm{pr}(P_{i,t} \leq x \mid H^{0}_{i,t}) \leq x \quad (i = 1, \dots, N).$$

The proof can be found in Appendix 1. To summarize, by Theorem 1 the FDR with respect to both
sets of hypotheses, $H^{s}_{i,t}$ and $H^{0}_{i,t}$, is controlled simultaneously.
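The two notions of a true null hypothesis (in control since the start, and in control since the chart last visited 0) can be checked for a single stream as in the sketch below. The function names, the boolean list `flags` (whether time $v$ is out of control, stored at index $v - 1$) and the toy path are assumptions made for illustration.

```python
def null_since_start(flags, t):
    """True if the stream was in control at every time 1, ..., t."""
    return not any(flags[:t])


def null_since_zero(flags, path, t):
    """True if some tau in {0, ..., t} has S_tau = 0 and the stream was in
    control at all times tau < v <= t (path[v - 1] holds S_v, and S_0 = 0)."""
    for tau in range(t, -1, -1):
        s_tau = 0.0 if tau == 0 else path[tau - 1]
        if s_tau == 0.0:
            return True
        # Time tau is out of control, so no earlier tau can qualify either.
        if flags[tau - 1]:
            return False
    return False


# Toy stream: out of control at times 2 and 3, chart back at 0 at time 5.
flags = [False, True, True, False, False, False]
path = [0.0, 1.0, 2.0, 0.8, 0.0, 0.3]
start_nulls = [null_since_start(flags, t) for t in range(1, 7)]
zero_nulls = [null_since_zero(flags, path, t) for t in range(1, 7)]
```

Under the first definition the null hypothesis is false from time 2 onwards, whereas under the second it becomes true again once the chart has returned to 0 while the stream is in control.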
4 Simulations



In this section we demonstrate the performance of our proposed method (Algorithm 2) under
different definitions (Section 3.3 and Section 3.4) of a false discovery via simulations.

For each stream, we construct a CUSUM chart according to (2). In this simulation we let
$F^*_{i,t}$ be the $N(-1/2, 1)$ distribution and $F_{i,t}$ be the $N(1/2, 1)$ distribution when out-of-control, for all $t \in \mathbb{N}$, and set the upper
boundary $h = 10$.



To compute the in-control CUSUM chart distribution, that of $S^*_t$, we use the method of Brook and Evans (1972).
If the chart is forced to take only finitely many values, by using the function $\varphi$ in (2),
then the distribution can be computed exactly, as it is just the distribution of a finite-state Markov
chain. We proceed by partitioning $[0, h]$ into $M + 1$ states by using

$$\varphi(x) = \begin{cases} 0, & x \in [0, w_1), \\ (w_j + w_{j-1})/2, & x \in [w_{j-1}, w_j) \quad (j = 2, \dots, M), \\ h, & x \in [w_M, h], \end{cases}$$

where $w_j = h(j - 0.5)/M$ for $j = 1, \dots, M$.

For each iteration we took 100 streams over a period of 100 time points and partitioned
$[0, h]$ into 100 states with $q^* = 0.05$. A discrete time-homogeneous Markov chain is used to simulate
the observations, for all charts, moving from in-control to out-of-control and vice versa. This
Markov chain is defined by the transition probabilities $\mathrm{pr}(F_{i,t+1} = F^*_{i,t+1} \mid F_{i,t} \neq F^*_{i,t}) = \alpha$ and
$\mathrm{pr}(F_{i,t+1} \neq F^*_{i,t+1} \mid F_{i,t} = F^*_{i,t}) = \beta$ for some known $0 < \alpha, \beta < 1$ and for all $t > 0$, with all streams
starting in control. In this simulation we let $\alpha = 0.01$, $\beta = 0.07$. This simulation was repeated
10,000 times, using the same seed. We consider the Benjamini and Hochberg (1995), the two-step
and the adaptive linear step-up FDR control procedures.
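The mechanism that switches a stream between the two states can be sketched as follows. This is an illustration under assumptions: `alpha` is read as the probability of returning to control and `beta` as the probability of leaving control, the state at time 1 is obtained by one transition from an in-control state at time 0, and the function name and seed are hypothetical.

```python
import random


def simulate_stream(T, alpha, beta, rng):
    """Simulate one stream for times 1, ..., T.

    Returns (flags, zs): flags[v - 1] is True when time v is out of control,
    and zs[v - 1] is the increment, N(1/2, 1) if out of control, else N(-1/2, 1).
    """
    out = False  # the stream starts in control
    flags, zs = [], []
    for _ in range(T):
        if out:
            if rng.random() < alpha:   # return to the in-control state
                out = False
        else:
            if rng.random() < beta:    # move to the out-of-control state
                out = True
        flags.append(out)
        zs.append(rng.gauss(0.5 if out else -0.5, 1.0))
    return flags, zs


rng = random.Random(2013)  # arbitrary seed, for reproducibility of the sketch
flags, zs = simulate_stream(100, alpha=0.01, beta=0.07, rng=rng)
```

With a small `alpha`, an out-of-control period tends to be long once it starts, so the number of streams that have been in control since the start shrinks quickly over time.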

Figure 2a displays a CUSUM chart from a single iteration. The threshold given by the Benjamini
and Hochberg (1995) FDR control procedure pointwise in time is also displayed. This threshold,
based upon the remaining 99 charts in the same iteration, is the value which the presented CUSUM
chart needs to exceed in order to signal out-of-control.

Figure 2c displays the FDR using these control procedures. All procedures control the FDR
below $q^* = 0.05$. However, the two-step and the adaptive linear step-up procedures control the FDR
nearer to $q^*$ than the Benjamini and Hochberg (1995) FDR procedure. This is because the other FDR
control procedures estimate $m_0$ first, then apply the Benjamini and Hochberg (1995) procedure.
For the same simulation, Figure 2d displays the FDR under the original hypotheses, $H^{s}_{i,t}$. We see
that the FDR for all the control procedures decreases over time, unlike in Figure 2c. This is explained
by the lower number of true null hypotheses, $m_0$, at each time point under $H^{s}_{i,t}$ (Figure 2b).



5 Discussion 



In the simulations in Section 4 we have used $\varphi$ to force the CUSUM chart to take only finitely
many states. This ensures that the distribution of $S^*_t$ in Step 1 of Algorithm 2 can be computed
exactly and thus the FDR is guaranteed to be controlled. Allowing the CUSUM chart to take
continuous values, by using $\varphi(x) = x$, will no longer guarantee the control of the FDR as Step 1
of Algorithm 2 can only be done approximately. Further simulations, not reported here, showed
that the false discovery rate was still controlled when using a Markov chain approximation with a
reasonably large number of states. These simulations were similar to those in Section 4.

Ideally, we would like to define a false discovery as signalling out of control at time $t$ when in fact
the observation is in control at time $t$, i.e. $F_{i,t} = F^*_{i,t}$. This is much stronger than our definitions of
a false discovery, and thus the FDR will not be controlled under this stronger definition. It seems
reasonable to assume that this FDR will depend on how quickly the observations switch between
the in-control and the out-of-control state. Investigating this is a topic for further research.

Figure 2: (a) Example of a single CUSUM chart (solid) from the simulation with thresholds
(dashed). The true out-of-control periods are given by the grey areas. (b) Median of $m_0$ (solid) with
95% (dashed) and 50% (dotted) quantiles pointwise in time under $H^{s}_{i,t}$ (grey) and $H^{0}_{i,t}$ (black). (c)
Estimated FDR for the Benjamini and Hochberg (1995) (dotted), two-step (dashed) and adaptive
linear step-up (solid) control procedures with $q^* = 0.05$ using $H^{0}_{i,t}$. (d) Same as (c) but using $H^{s}_{i,t}$.



Appendix 1 

Proof of Theorem 1 

Since each stream is independent we can drop the subscript $i$. We say a random variable $V$ is
stochastically smaller than a random variable $Y$, denoted by $V \leq_{st} Y$, if $\mathrm{pr}(V \leq x) \geq \mathrm{pr}(Y \leq x)$
for all $x \in \mathbb{R}$.

We start the proof by showing, by induction on $t \in \mathbb{N}_0$, that

$$S_t \mid H^0_t \leq_{st} S^*_t. \tag{5}$$

At time $t = 0$, we have $S_0 = S^*_0 = 0$ and $\mathrm{pr}(H^0_0) = 1$, thus (5) holds.

At time $t \in \mathbb{N}$ consider the case $F_t \neq F^*_t$. Then $H^0_t = \{S_t = 0\}$ and $\mathrm{pr}(S_t \leq x \mid H^0_t) = 1$
for all $x \geq 0$. As $S^*_t \geq 0$, (5) holds for this case. For the case $F_t = F^*_t$, first assume (5) holds at
time $(t - 1)$. Hence, by the recursive definition of $S_t$ and $S^*_t$ in (2), and by the persistence of
stochastic orders under convolution of independent random variables and under action of multiple
increasing functions (Theorems 1.2.13 and 1.2.17 in Müller and Stoyan, 2002, pg. 6 and 7), we get
$S_t \mid H^0_{t-1} = \varphi[\min\{\max(0, S_{t-1} + Z_t), h\}] \mid H^0_{t-1} \leq_{st} S^*_t$. Thus it suffices to show

$$S_t \mid H^0_t \leq_{st} S_t \mid H^0_{t-1}. \tag{6}$$

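As $F_t = F^*_t$, we have $H^0_t = H^0_{t-1} \cup \{S_t = 0\}$. Letting $G(x) = \mathrm{pr}(S_t \leq x \mid H^0_t)$, $J(x) = \mathrm{pr}(S_t \leq x \mid H^0_{t-1})$
and $a = \mathrm{pr}(H^0_{t-1})/\mathrm{pr}(H^0_t)$, we have, for $x \geq 0$,

$$\begin{aligned}
G(x) &= \mathrm{pr}(\{S_t \leq x, H^0_{t-1}\} \cup \{S_t = 0\}) / \mathrm{pr}(H^0_t) \\
&= [\mathrm{pr}(S_t \leq x, H^0_{t-1}) + \mathrm{pr}(S_t = 0) - \mathrm{pr}(S_t = 0, H^0_{t-1})] / \mathrm{pr}(H^0_t) \\
&= [J(x)\,\mathrm{pr}(H^0_{t-1}) + \mathrm{pr}(S_t = 0) - J(0)\,\mathrm{pr}(H^0_{t-1})] / \mathrm{pr}(H^0_t) \\
&= a J(x) - a J(0) + \mathrm{pr}(S_t = 0)/\mathrm{pr}(H^0_t). \tag{7}
\end{aligned}$$

By setting $x = 0$ in (7), we get $G(0) = \mathrm{pr}(S_t = 0)/\mathrm{pr}(H^0_t)$, and so $G(x) - G(0) = a\{J(x) - J(0)\}$.

The distribution of $S_t \mid H^0_t$ is derived from the distribution of $S_t \mid H^0_{t-1}$ by potentially adding
mass at 0 before rescaling. Thus $0 < a \leq 1$ and $G(0) \geq J(0)$. Letting $x \to \infty$ in $G(x) - G(0) = a\{J(x) - J(0)\}$
gives $1 - G(0) = a\{1 - J(0)\}$, and subtracting the two identities yields $1 - G(x) = a\{1 - J(x)\} \leq 1 - J(x)$.
Hence $G(x) \geq J(x)$ for all $x \geq 0$, while $G(x) = J(x) = 0$ for $x < 0$. Thus (6) holds. This finishes showing (5).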

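Since $P(\cdot)$, defined in Section 3.2, is a decreasing function, application of $P$ on (5) (an extension
to Theorem 1.2.13 in Müller and Stoyan, 2002, pg. 6) yields

$$P_t \mid H^0_t \geq_{st} P_t \mid H^s_t \quad (t \in \mathbb{N}_0). \tag{8}$$

By construction of $P_t$, we have

$$P_t \mid H^s_t \geq_{st} U, \tag{9}$$

where $U$ is uniformly distributed on $[0, 1]$. Combining (8) and (9) gives

$$P_t \mid H^0_t \geq_{st} P_t \mid H^s_t \geq_{st} U,$$

that is, $\mathrm{pr}(P_t \leq x \mid H^0_t) \leq x$ for all $x \in [0, 1]$.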